14 research outputs found

    The Interdisciplinarity of Scientific Research Data

    Technical advances have lowered some barriers to data sharing and reuse, but data sharing is a sociotechnical phenomenon, and the impact of the ongoing evolution in scholarly communication practices has yet to be quantified. With the open science movement, research data citation for data sharing and reuse is becoming more common. There is also a need for a deeper and more nuanced understanding of the extent of interdisciplinarity in data citation when research data are shared and reused. Interdisciplinary collaboration is closely related to data reuse across disciplines because disciplines influence one another: collaboration is one channel of influence and citation is another. Citation is commonly considered to be closely related to scientific impact because it measures formal scholarly impact. This study examined the interdisciplinarity of scientific research data, especially how scientific research data are reused in bibliographies. The researcher measured variety, balance and diversity to examine the extent to which scientific research data are reused in other disciplines. The study found that scientific research data are reused across disciplines, although the prevalence of interdisciplinarity varies by scientific discipline. The findings contribute to the study of the interdisciplinarity of scientific research data for data sharing and reuse.
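The abstract does not state which formulas were used for variety, balance and diversity. As a minimal sketch, assuming variety is the number of distinct citing disciplines, balance is Shannon evenness, and diversity is the Gini-Simpson index (all three choices are assumptions, as is the function name `interdisciplinarity`), the measures could be computed like this:

```python
import math
from collections import Counter


def interdisciplinarity(citing_disciplines):
    """Return (variety, balance, diversity) for a list of discipline labels.

    Assumed definitions (not taken from the study):
      variety   = number of distinct disciplines citing the dataset
      balance   = Shannon evenness, H / ln(variety), in [0, 1]
      diversity = Gini-Simpson index, 1 - sum(p_i ** 2), in [0, 1)
    """
    counts = Counter(citing_disciplines)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one citing discipline is required")

    # Share of citations coming from each discipline.
    shares = [count / total for count in counts.values()]

    variety = len(counts)
    entropy = -sum(p * math.log(p) for p in shares)
    # Evenness is undefined for a single discipline; report 0.0 in that case.
    balance = entropy / math.log(variety) if variety > 1 else 0.0
    diversity = 1.0 - sum(p * p for p in shares)
    return variety, balance, diversity
```

For example, a dataset cited twice from biology and once each from chemistry and physics has a variety of 3 and a Gini-Simpson diversity of 0.625, while a dataset cited from a single discipline scores 0 on both balance and diversity.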

    The Impact of Research Data Sharing and Reuse on Data Citation in STEM Fields

    Despite the open science movement and mandates for the sharing of research data by major funding agencies and influential journals, the citation of data sharing and reuse has not become standard practice in the various science, technology, engineering and mathematics (STEM) fields. Advances in technology have lowered some barriers to data sharing, but data sharing is a socio-technical phenomenon, and the impact of the ongoing evolution in scholarly communication practices has yet to be quantified. Furthermore, there is a need for a deeper and more nuanced understanding of author self-citation and recitation, the most often cited types of data, disciplinary differences regarding data citation, and the extent of interdisciplinarity in data citation. This study employed a mixed methods approach that combined coding with semi-automatic text-searching techniques in order to assess the impact of data sharing and reuse on data citation in STEM fields. The research considered over 500,000 open research data entities, such as datasets, software and data studies, from over 350 repositories worldwide. I also examined 705 bibliographic publications with a total of 15,261 instances of data sharing, reuse, and citation at the data, article, discipline and interdisciplinary levels. More specifically, I measured the phenomenon of data sharing in terms of formal data citation, frequently cited data types, and author self-citation, and I explored recitation at both the data and bibliography levels, as well as data reuse practices in bibliographies, associations of disciplines, and interdisciplinary contexts. The results of this research revealed, to begin with, disciplinary differences with regard to the impact of data sharing and reuse on data citation in STEM fields.
This research also yielded the following additional findings regarding the citation of data by STEM researchers: 1) data sharing practices were diverse across disciplines; 2) data sharing has been increasing in recent years; 3) each discipline made use of major digital repositories; 4) these repositories took various forms depending on the discipline; 5) certain data types were more often cited in each discipline, so that the frequency distribution of the data types was highly skewed; 6) author self-citation and recitation followed similar trends at the data and bibliographic levels, but specific practices varied within each discipline; 7) associations between and across data and author self-citation and recitation at the bibliographic level were observed, with the self-citation rate differing significantly among disciplines; 8) data reuse in bibliographies was rare yet diverse; 9) informal citation of data sharing and reuse at the bibliographic level was more common in certain fields, with astronomy/physics showing the highest share (98%) and technology the lowest (69%); 10) within bibliographic publications, the documentation of data sharing and reuse occurred mainly in the main text; 11) publications in certain disciplines, such as chemistry, computing and engineering, did not attract citations from more than one field (i.e., showed no diversity); and, on the other hand, 12) publications in other fields attracted a wide range of interdisciplinary data citations. This dissertation, then, contributes to the understanding of two key aspects of the current citation systems. First, the findings have practical implications for individual researchers, decision makers, funding agencies and publishers with regard to giving due credit to those who share their data. Second, this research has methodological implications in terms of reducing the labor required to analyze the full text of associated articles in order to identify evidence of data citation.

    An Examination of Research Data Sharing and Re-Use: Implications for Data Citation Practice

    This study examines characteristics of data sharing and data re-use in Genetics and Heredity, where data citation is most common. The study applies an exploratory method because data citation is a relatively new area. The Data Citation Index (DCI) on the Web of Science was selected because it provides a single access point to over 500 data repositories worldwide and to over two million data studies and datasets across multiple disciplines, and because it monitors the quality of research data through a peer review process. We explore data citations for Genetics and Heredity as a case study, formally by examining citations recorded in the DCI and informally by sampling a selection of papers for implicit data citations within publications. Citer-based analysis is conducted in order to correct for self-citation in data citation. We explore 148 sampled citing articles in order to identify factors that influence data sharing and data re-use, including references, main text, supplementary data/information, acknowledgments, funding information, author information, and web/author resources. This study is unique in that it relies on a citer-based analysis approach and analyzes peer-reviewed and published data, data repositories, and citing articles of highly productive authors where data sharing is most prevalent. This research is intended to provide a methodological and practical contribution to the study of data citation.

    Evaluation of Mappings from MARC to Linked Data

    The purpose of this study is to assess the quality and compatibility of library linked data (LLD) schemas in use or proposed for library resources. Linked Data (LD) has the potential to provide high quality metadata on the web, with the ability to incorporate existing structured data from MARC via a mapping. The researchers selected representative libraries: Harvard University Library, LC BIBFRAME (Library of Congress Bibliographic Framework), OCLC (Online Computer Library Center) WorldCat, and the National Library of Spain. For the LD frameworks, four resources were matched into specific categories with MARC (MAchine-Readable Cataloging) tags so that they could be retrieved in both OCLC LD and BIBFRAME with the conversion tool at bibframe.org: (1) classic, ebook, and fiction; (2) multiple authors, part of a series, and non-fiction; (3) varying title, translation, and fiction; and (4) subtitle, non-fiction. This study revealed that the choices of elements each library made in local decisions might cause interoperability issues for LD services because of problems in creating quality metadata.

    From industry to scholarly communication: biometric literature over time

    This study investigated the influence between industry and scholarly communication through a comparative analysis of patterns in a large-scale dataset on biometric information. To identify whether patterns of cutting-edge development in industry affect, are affected by, and/or are studied in parallel with scholarly communication over time, analyses of trending topics, word frequency, and temporal burst detection were conducted to assess prominent terms. Patents published in USPTO from 1790 to 2014 were analyzed to represent industry, and published documents such as peer-reviewed journals, conference proceedings and ebooks from both Thomson Reuters Web of Science and IEEE Xplore were analyzed to represent scholarly communication. The results of this study revealed that (1) there are matching trends in the number of publications, (2) transformation points in time are detected using the temporal burst analysis, and (3) patterns of cutting-edge developments in industry might not affect, be affected by, and/or develop in parallel with scholarly communication over time in biometric literature.

    Formalised data citation practices would encourage more authors to make their data available for reuse

    It is increasingly common for researchers to make their data freely available. This is often a requirement of funding agencies but also consistent with the principles of open science, according to which all research data should be shared and made available for reuse. Once data is reused, the researchers who have provided access to it should be acknowledged for their contributions, much as authors are recognised for their publications through citation. Hyoungjoo Park and Dietmar Wolfram have studied characteristics of data sharing, reuse, and citation and found that current data citation practices do not yet benefit data sharers, with little or no consistency in their format. More formalised citation practices might encourage more authors to make their data available for reuse.

    Open Peer Review: The Current Landscape and Emerging Models

    Open peer review (OPR) is an important innovation in the open science movement. OPR can play a significant role in advancing scientific communication by increasing its transparency. Despite the growing interest in OPR, adoption of this innovation since the turn of the century has been slow. This study provides the first comprehensive investigation of OPR adoption, its early adopters and the implementation models used. We identified 174 current OPR journals and analysed their wide-ranging implementations to derive emerging OPR models. The findings suggest that: 1) there has been a steady growth in OPR adoption since 2001 when 38 journals initially adopted OPR; 2) OPR adoption is most prevalent in medicine and the natural sciences; 3) three publishers are responsible for 87% of identified OPR journals; 4) early adopter publishers have implemented different models of OPR resulting in different levels of transparency. Across the variations in OPR implementations, two important factors define the degree of transparency: open identities and open reports. Open identities may include reviewer names and affiliation as well as credentials; open reports may include timestamped review histories consisting of referee reports and author rebuttals. When and where open reports can be accessed are also important factors indicating the OPR transparency level. Dimensions that characterize the observed OPR models are outlined

    Implications of data sharing on formal data citation in biomedical fields

    Formal data citation recognizes data sharing sources and is itemized in the references section of published bibliographies. The current practice of formal data citation is problematic because it is not widely adopted. This study examined research data sharing in data repositories within the biomedical field. The study also explored how research data are documented and formally cited (i.e., formal references in the published bibliographies). Data were collected from major data sharing repositories commonly used in biomedical fields. Uniform Resource Locators (URLs) were found to be more widely used as data identifiers than Digital Object Identifiers (DOIs), which is concerning because URLs are less stable than DOIs. The rate of data sharing in the Data Citation Index (DCI) and the rate of formal data citation in bibliographic references in the Web of Science (WoS) correspond to most practices in data repositories in biomedical fields. The contribution of this study is to provide insight into the form and use of formal data citation in scholarly communications so that data sharers consistently receive appropriate acknowledgment and formal scholarly credit.
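The abstract does not describe how identifiers were told apart. As a rough sketch only, assuming a DOI is recognized by its `10.<registrant>/<suffix>` shape (optionally behind a `doi:` prefix or a doi.org resolver address) and anything else starting with `http` counts as a plain URL, a tally of identifier types could start from a helper like this. The function name `classify_identifier` and the pattern are illustrative assumptions, not the study's method:

```python
import re

# Assumed pattern: optional "doi:" or doi.org resolver prefix, then
# "10.", a 4-9 digit registrant code, a slash, and a non-empty suffix.
DOI_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:)?10\.\d{4,9}/\S+$",
    re.IGNORECASE,
)
URL_PATTERN = re.compile(r"^https?://", re.IGNORECASE)


def classify_identifier(identifier):
    """Label a data identifier as 'doi', 'url', or 'other'."""
    text = identifier.strip()
    # Check for DOIs first so that doi.org links are not counted as plain URLs.
    if DOI_PATTERN.match(text):
        return "doi"
    if URL_PATTERN.match(text):
        return "url"
    return "other"
```

Checking for DOIs before URLs matters here because a DOI expressed as a resolver link would otherwise be counted among the less stable plain URLs.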